Research question 1 - What is the difference in technology adoption between Western and non-Western countries?

Reviews will be grouped by area and by host in each country. The Gini coefficient will be used to measure inequality, which makes it possible to gauge how concentrated Airbnb use is in Western and non-Western countries and how that concentration evolves over time. The Gini coefficient is usually defined from the Lorenz curve as the ratio of the area between the line of equality and the Lorenz curve to the total area under the line of equality. The coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality (Dorfman, 1979).
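
The Lorenz-curve definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's own implementation: the function name `gini` and the sample counts are assumptions made for the example. Note that for a finite sample of n units the largest attainable value is (n - 1)/n, so 1 is only reached in the limit.

```python
def gini(values):
    """Gini coefficient of non-negative values, computed from the Lorenz curve."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Lorenz curve: cumulative share of the total held by the k smallest units.
    lorenz = [0.0]
    running = 0.0
    for x in xs:
        running += x
        lorenz.append(running / total)
    # Area under the Lorenz curve by the trapezoid rule (each step has width 1/n).
    area_under_lorenz = sum((lorenz[k] + lorenz[k + 1]) / 2 for k in range(n)) / n
    # The area under the line of equality is 1/2, so G = (1/2 - B) / (1/2).
    return (0.5 - area_under_lorenz) / 0.5

# Hypothetical review counts per neighbourhood (illustrative values only).
even = [100, 100, 100, 100]
concentrated = [0, 0, 0, 400]
print(gini(even))          # 0.0
print(gini(concentrated))  # 0.75, i.e. (n - 1) / n for n = 4
```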

Setup

Imports

In [175]:
import numpy as np
import pandas as pd
import geopandas as gpd

import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from matplotlib.colors import ListedColormap
%matplotlib inline
import seaborn as sns

from scipy import stats
import mapclassify as mp
import datetime

import warnings
warnings.filterwarnings('ignore')

Styles

In [11]:
def set_plot_styles(styles):
    mpl.rcParams.update(mpl.rcParamsDefault)
    plt.style.use(styles)
    
set_plot_styles(['mplstyle.config'])
In [104]:
reviews_colors = ['#adcbe3', '#4b86b4', '#2a4d69']
reviews_color_palette = sns.color_palette(reviews_colors)

def reviews_colors_map(values):
    # Build a colormap from only the bin indices (0, 1, 2) present in `values`,
    # so each bin keeps its own colour when some bins are missing from a map.
    return ListedColormap([reviews_colors[int(i)] for i in sorted(set(values))])

listings_colors = ['#ffdbac', '#e0ac69', '#8d5524']
listings_color_palette = sns.color_palette(listings_colors)

def listings_colors_map(values):
    # Build a colormap from only the bin indices (0, 1, 2) present in `values`,
    # so each bin keeps its own colour when some bins are missing from a map.
    return ListedColormap([listings_colors[int(i)] for i in sorted(set(values))])

Read datasets: English reviews, non-English reviews and listings

In [13]:
western_df_english = pd.read_pickle('western_df_english.pkl')
non_western_df_english = pd.read_pickle('non_western_df_english.pkl')
In [14]:
western_df_non_english = pd.read_pickle('western_df_non_english.pkl')
non_western_df_non_english = pd.read_pickle('non_western_df_non_english.pkl')
In [15]:
listings_western_df = pd.read_pickle('listings_western_df.pkl')
listings_non_western_df = pd.read_pickle('listings_non_western_df.pkl')

Concatenate English and non-English comments into one data frame

In [16]:
western_comments = pd.concat([western_df_english, western_df_non_english]).reset_index(drop=True)
non_western_comments = pd.concat([non_western_df_english, non_western_df_non_english]).reset_index(drop=True)

Geographical adoption analysis

Western and Non-Western countries analysis definitions

Filter cities from dfs definition

In [161]:
def filter_city(df, city):
    return df[df['city'] == city]

Read geojson files definition

In [162]:
def read_geo_file(file):
    return gpd.read_file(file).drop('neighbourhood_group', axis=1)

Group comments and listings by neighbourhood definitions

In [26]:
def group_comments_by_neighbourhood(comments_df, col_name):
    # Count the reviews in each neighbourhood and name the count column `col_name`.
    return (comments_df.groupby(['neighbourhood_cleansed'])['comments']
            .count().reset_index().rename(columns={'comments': col_name}))
In [27]:
def group_listings_by_neighbourhood(listings_df):
    # Count the listings in each neighbourhood.
    return (listings_df.groupby(['neighbourhood_cleansed'])['id']
            .count().reset_index().rename(columns={'id': 'listings_count'}))

Merge English, non English comments and listings to one df definition

In [28]:
def merged_comments_listings_by_neighbourhood(comments_df_english, comments_df_non_english, listings_df):
    comments_english_by_neighbourhood = group_comments_by_neighbourhood(comments_df_english, 'comments_english_count')
    comments_non_english_by_neighbourhood = group_comments_by_neighbourhood(comments_df_non_english, 'comments_non_english_count')
    comments_by_neighbourhood_merged = pd.merge(comments_english_by_neighbourhood, comments_non_english_by_neighbourhood, on='neighbourhood_cleansed', how='inner')
    listings_by_neighbourhood = group_listings_by_neighbourhood(listings_df)
    return pd.merge(comments_by_neighbourhood_merged, listings_by_neighbourhood, on='neighbourhood_cleansed', how='inner')
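
Both merges above use how='inner', so a neighbourhood survives only if it has rows in all three tables. The sketch below mimics that with plain dictionaries; `inner_join_counts` and the neighbourhood counts are hypothetical and only illustrate the set logic, not pandas itself. A neighbourhood dropped here would later appear with 0 in every column once the result is joined to the map and filled with fillna(0), so if that is not intended, how='outer' followed by fillna(0) may be worth considering.

```python
def inner_join_counts(english, non_english, listings):
    """Mimic two chained inner joins on the neighbourhood key using plain dicts."""
    shared = set(english) & set(non_english) & set(listings)
    return {
        name: {
            'comments_english_count': english[name],
            'comments_non_english_count': non_english[name],
            'listings_count': listings[name],
        }
        for name in sorted(shared)
    }

# Hypothetical neighbourhood counts (illustrative only).
english = {'Camden': 120, 'Hackney': 80, 'Sutton': 5}
non_english = {'Camden': 30, 'Hackney': 10}  # Sutton has no non-English reviews
listings = {'Camden': 40, 'Hackney': 25, 'Sutton': 3}

merged = inner_join_counts(english, non_english, listings)
print(sorted(merged))  # ['Camden', 'Hackney']: Sutton is dropped entirely
```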

Sum English and non English comments definition

In [29]:
def all_comments_calculate(df):
    # Add the total review count per neighbourhood (modifies df in place).
    df['comments_count'] = df['comments_english_count'] + df['comments_non_english_count']

Merge dfs and compute sum definition

In [30]:
def merge_calculate_sum_by_neighbourhood(comments_df_english, comments_df_non_english, listings_df):
    comments_listings_by_neighbourhood = merged_comments_listings_by_neighbourhood(comments_df_english, comments_df_non_english, listings_df)
    all_comments_calculate(comments_listings_by_neighbourhood)
    return comments_listings_by_neighbourhood

Join comments and listings with map definition

In [31]:
def join_comments_by_neighbourhood_with_map(map_geodf, comments_by_neighbourhood_df, col_name):
    comments_on_map = map_geodf.set_index('neighbourhood').join(comments_by_neighbourhood_df.set_index('neighbourhood_cleansed')).reset_index().rename(columns={'index':'neighbourhood'})
    comments_on_map[col_name] = comments_on_map[col_name].fillna(0)
    return comments_on_map
In [32]:
def join_listings_by_neighbourhood_with_map(map_geodf, listings_by_neighbourhood_df):
    listings_on_map = map_geodf.set_index('neighbourhood').join(listings_by_neighbourhood_df.set_index('neighbourhood_cleansed')).reset_index().rename(columns={'index':'neighbourhood'})
    listings_on_map['listings_count'] = listings_on_map.listings_count.fillna(0)
    return listings_on_map

Join merged comments and listings with map definition

In [33]:
def join_comments_and_listings_by_neighbourhood_with_map(map_geodf, comments_listings_by_neighbourhood):
    comments_listings_on_map = map_geodf.set_index('neighbourhood').join(comments_listings_by_neighbourhood.set_index('neighbourhood_cleansed')).reset_index().rename(columns={'index':'neighbourhood'}).fillna(0)
    return comments_listings_on_map

Convert df to geodf definition

In [34]:
def as_geodf(df):
    return gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.longitude, df.latitude))

Assign bins definition

In [136]:
def assign_bin(value, bins):
    for i, bin_max_value in enumerate(bins):
        if value < bin_max_value:
            return i
    return len(bins) - 1

def assign_bins(df, comments_bins, listings_bins):
    df['comments_english_count_bin'] = df.apply(lambda row: assign_bin(row['comments_english_count'], comments_bins), axis = 1)
    df['comments_non_english_count_bin'] = df.apply(lambda row: assign_bin(row['comments_non_english_count'], comments_bins), axis = 1)
    df['comments_count_bin'] = df.apply(lambda row: assign_bin(row['comments_count'], comments_bins), axis = 1)
    df['listings_count_bin'] = df.apply(lambda row: assign_bin(row['listings_count'], listings_bins), axis = 1)
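
As a quick check of the binning rule, the sketch below copies assign_bin and applies it to a few sample counts, using the Western review edges that the notebook prints further down. The sample counts and the `labels` list are illustrative assumptions. A value equal to an edge falls into the next bin, and anything at or above the last edge stays in the last bin.

```python
def assign_bin(value, bins):
    # Same logic as the notebook's assign_bin: index of the first upper edge above value.
    for i, bin_max_value in enumerate(bins):
        if value < bin_max_value:
            return i
    return len(bins) - 1

# Upper edges printed by the notebook for Western review counts.
bins_comments_western = [11818.55, 59080.75, 236314.0]
names = ['Low', 'Medium', 'High']

# Illustrative counts, including values that sit exactly on an edge.
samples = [0, 11818, 11818.55, 59080.75, 236314.0]
labels = [names[assign_bin(count, bins_comments_western)] for count in samples]
print(labels)  # ['Low', 'Low', 'Medium', 'High', 'High']
```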

Plot comments definition

In [133]:
color = sns.color_palette('tab20')

def plot_comments_on_map1(comments_and_maps, xsize, ysize, df):
    # xsize is the number of rows and ysize the number of columns of the grid;
    # squeeze=False keeps axs two-dimensional for any grid size.
    fig, axs = plt.subplots(xsize, ysize, figsize=(15, 10), squeeze=False)
    i = 0
    j = 0
    for a in comments_and_maps:
        xlim = None
        ylim = None
        if len(a) == 5:
            [map_geodf, comments_geodf, city, xlim, ylim] = a
        else:
            [map_geodf, comments_geodf, city] = a
        plot_comments_on_map2(axs[i, j], fig, map_geodf, comments_geodf, city, xlim, ylim)
        # Move to the next column and wrap to the next row after the last column.
        j = j + 1
        if j > (ysize - 1):
            i = i + 1
            j = 0
    # Lay out and show the figure once, after every panel has been drawn.
    fig.tight_layout()
    plt.show()
In [134]:
color = sns.color_palette('tab20')

def plot_comments_on_map2(ax, fig, map_geodf, comments_geodf, city, xlim=None, ylim=None):
    # Draw one panel: the neighbourhood map with one yellow point per review.
    ax.axis('off')
    if xlim:
        ax.set_xlim(xlim)
    if ylim:
        ax.set_ylim(ylim)
    ax.set_title('Number of reviews by neighbourhood in ' + city)
    map_geodf.plot(ax=ax, color=color[0], edgecolor='white')
    comments_geodf.plot(ax=ax, marker='.', markersize=1, color='yellow')

Plot listings definition

In [139]:
color = sns.color_palette('tab20')

def plot_listings_on_map1(listings_and_maps, xsize, ysize, df):
    # xsize is the number of rows and ysize the number of columns of the grid;
    # squeeze=False keeps axs two-dimensional for any grid size.
    fig, axs = plt.subplots(xsize, ysize, figsize=(15, 10), squeeze=False)
    i = 0
    j = 0
    for a in listings_and_maps:
        xlim = None
        ylim = None
        if len(a) == 5:
            [map_geodf, listings_geodf, city, xlim, ylim] = a
        else:
            [map_geodf, listings_geodf, city] = a
        plot_listings_on_map2(axs[i, j], fig, map_geodf, listings_geodf, city, xlim, ylim)
        # Move to the next column and wrap to the next row after the last column.
        j = j + 1
        if j > (ysize - 1):
            i = i + 1
            j = 0
    # Lay out and show the figure once, after every panel has been drawn.
    fig.tight_layout()
    plt.show()
In [140]:
color = sns.color_palette('tab20')

def plot_listings_on_map2(ax, fig, map_geodf, listings_geodf, city, xlim=None, ylim=None):
    # Draw one panel: the neighbourhood map with one yellow point per listing.
    ax.axis('off')
    if xlim:
        ax.set_xlim(xlim)
    if ylim:
        ax.set_ylim(ylim)
    ax.set_title('Number of listings by neighbourhood in ' + city)
    map_geodf.plot(ax=ax, color=color[0], edgecolor='white')
    listings_geodf.plot(ax=ax, marker='.', markersize=3, color='yellow')

Plot comments by neighbourhood definition

In [143]:
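def plot_comments_by_neighbourhood_on_map1(comments_by_neighbourhood, xsize, ysize, df, col_name):
    # xsize is the number of rows and ysize the number of columns of the grid;
    # squeeze=False keeps axs two-dimensional for any grid size.
    fig, axs = plt.subplots(xsize, ysize, figsize=(15, 10), squeeze=False)
    color = 'Oranges'
    # Shared colour bar scaled from 0 to the largest neighbourhood count in df.
    # Each panel is still normalised to its own data range by GeoDataFrame.plot.
    sm = plt.cm.ScalarMappable(cmap=plt.get_cmap(color, 5), norm=plt.Normalize(vmin=0, vmax=group_comments_by_neighbourhood(df, col_name)[col_name].max()))
    sm.set_array([])
    cbar = fig.colorbar(sm, ax=axs)
    #cbar.ax.tick_params()
    i = 0
    j = 0
    for a in comments_by_neighbourhood:
        xlim = None
        ylim = None
        if len(a) == 4:
            [comments_geodf, city, xlim, ylim] = a
        else:
            [comments_geodf, city] = a
        plot_comments_by_neighbourhood_on_map2(axs[i, j], fig, color, comments_geodf, city, col_name, xlim, ylim)
        # Move to the next column and wrap to the next row after the last column.
        j = j + 1
        if j > (ysize - 1):
            i = i + 1
            j = 0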
In [144]:
def plot_comments_by_neighbourhood_on_map2(ax, fig, color, comments_by_neighbourhood_geodf, city, col_name, xlim=None, ylim=None):
    ax.axis('off')
    if xlim:
        ax.set_xlim(xlim)
    if ylim:
        ax.set_ylim(ylim)
    ax.set_title('Number of reviews by neighbourhood in ' + city)
    comments_by_neighbourhood_geodf.plot(col_name, cmap=color, ax=ax, edgecolor='#DDDDDD');

Plot listings by neighbourhood definition

In [147]:
def plot_listings_by_neighbourhood_on_map1(listings_by_neighbourhood, xsize, ysize, df):
    # xsize is the number of rows and ysize the number of columns of the grid;
    # squeeze=False keeps axs two-dimensional, so axs[i, j] works for any grid size
    # and no special cases for single-row or single-column grids are needed.
    fig, axs = plt.subplots(xsize, ysize, figsize=(15, 10), squeeze=False)
    color = 'Oranges'
    # Shared colour bar scaled from 0 to the largest neighbourhood count in df.
    # Each panel is still normalised to its own data range by GeoDataFrame.plot.
    sm = plt.cm.ScalarMappable(cmap=plt.get_cmap(color, 5), norm=plt.Normalize(vmin=0, vmax=group_listings_by_neighbourhood(df)['listings_count'].max()))
    sm.set_array([])
    cbar = fig.colorbar(sm, ax=axs)
    cbar.ax.tick_params()
    i = 0
    j = 0
    for a in listings_by_neighbourhood:
        xlim = None
        ylim = None
        if len(a) == 4:
            [listings_geodf, city, xlim, ylim] = a
        else:
            [listings_geodf, city] = a
        plot_listings_by_neighbourhood_on_map2(axs[i, j], fig, color, listings_geodf, city, xlim, ylim)
        # Move to the next column and wrap to the next row after the last column.
        j = j + 1
        if j > (ysize - 1):
            i = i + 1
            j = 0
In [148]:
def plot_listings_by_neighbourhood_on_map2(ax, fig, color, listings_by_neighbourhood_geodf, city, xlim=None, ylim=None):
    ax.axis('off')
    if xlim:
        ax.set_xlim(xlim)
    if ylim:
        ax.set_ylim(ylim)
    ax.set_title('Number of listings by neighbourhood in ' + city)
    listings_by_neighbourhood_geodf.plot('listings_count', cmap=color, ax=ax, edgecolor='#DDDDDD');

Plot comments and listings definition

In [133]:
def get_legend(value, texts):
    return texts[int(value)]

def plot_comments_listings_bins(title, df, field_name, ax, bins_names, colors_map):
    df.plot(field_name, ax=ax, edgecolor='#DDDDDD',
            cmap=colors_map(df[field_name].values), legend=True, categorical=True)
    
    for leg_text in ax.get_legend().get_texts():
        leg_text.set_text(get_legend(leg_text.get_text(), bins_names))

    ax.set_title(title)

def comments_listings_one_plot(comments_listings_by_neighbourhood, xlim=None, ylim=None, city=None):
    
    fig,([ax1,ax2],[ax3,ax4]) = plt.subplots(2, 2, figsize=(15,10), sharex=True, sharey=True, 
                                             subplot_kw=dict(aspect='equal'))
    
    plot_comments_listings_bins('English comments', comments_listings_by_neighbourhood, 
                                'comments_english_count_bin', ax1, bins_comments_names_western, reviews_colors_map)
    ax1.set_xlim(xlim)
    ax1.set_ylim(ylim)
    
    plot_comments_listings_bins('Non English comments', comments_listings_by_neighbourhood, 
                                'comments_non_english_count_bin', ax2, bins_comments_names_western, reviews_colors_map)
    plot_comments_listings_bins('All comments', comments_listings_by_neighbourhood, 
                                'comments_count_bin', ax3, bins_comments_names_western, reviews_colors_map)
    plot_comments_listings_bins('Listings', comments_listings_by_neighbourhood, 
                                'listings_count_bin', ax4, bins_listings_names_western, listings_colors_map)
    fig.suptitle(city, y=1.05)
    
    for ax in (ax1,ax2,ax3,ax4):
        ax.axis('off')
        
    fig.tight_layout()    
    plt.show();

Comments and listings penetration distribution by neighbourhoods

Comments penetration distribution by neighbourhoods

Plot definition
In [17]:
def percentage(part, total):
    # Share of `part` in `total`, in percent, rounded to two decimals.
    return round(part/total*100, 2)
In [190]:
def plot_comments_penetration_distribution_by_neighbourhoods():
    fig, ax = plt.subplots(1, 2, figsize=(10,5))

    N, bins, patches = ax[0].hist(western_comments.groupby(['city', 'neighbourhood_cleansed'])['comments'].count(), bins=20)
    ax[0].set_title('Review penetration by neighbourhood in Western countries', pad=5)
    ax[0].set_xlabel('Review penetration')  
    ax[0].set_ylabel('Count')
    ax[0].set_yscale('log')

    bin_1_index = 1
    bin_2_index = 5
    bins_comments_western = [bins[bin_1_index], bins[bin_2_index], bins[-1]]
    bins_comments_names_western = ['Low', 'Medium', 'High']
    bins_comments_share_western = [percentage(sum(N[0:bin_1_index]), sum(N)), percentage(sum(N[bin_1_index:bin_2_index]), sum(N)), percentage(sum(N[bin_2_index:]), sum(N))]
    print('Western comments bins: {}'.format(bins_comments_western))
    print('Western comments bins names: {}'.format(bins_comments_names_western))
    print('Western comments bins share % : {}'.format(bins_comments_share_western))

    handles = [Rectangle((0,0),1,1, color=c, ec="k") for c in [reviews_color_palette[0], reviews_color_palette[1], reviews_color_palette[2]]]
    ax[0].legend(handles, bins_comments_names_western)

    for i in range(0,bin_1_index):
        patches[i].set_facecolor(reviews_color_palette[0])
    for i in range(bin_1_index,bin_2_index):    
        patches[i].set_facecolor(reviews_color_palette[1])
    for i in range(bin_2_index, len(patches)):
        patches[i].set_facecolor(reviews_color_palette[2])

    N, bins, patches = ax[1].hist(non_western_comments.groupby(['city', 'neighbourhood_cleansed'])['comments'].count(), bins=20)
    ax[1].set_title('Review penetration by neighbourhood in Non-Western countries', pad=5)
    ax[1].set_xlabel('Review penetration')  
    ax[1].set_ylabel('Count')
    ax[1].set_yscale('log')

    bin_1_index = 1
    bin_2_index = 5
    bins_comments_non_western = [bins[bin_1_index], bins[bin_2_index], bins[-1]]
    bins_comments_names_non_western = ['Low', 'Medium', 'High']
    bins_comments_share_non_western = [percentage(sum(N[0:bin_1_index]), sum(N)), percentage(sum(N[bin_1_index:bin_2_index]), sum(N)), percentage(sum(N[bin_2_index:]), sum(N))]
    print('Non-Western comments bins: {}'.format(bins_comments_non_western))
    print('Non-Western comments bins names: {}'.format(bins_comments_names_non_western))
    print('Non-Western comments bins share %: {}'.format(bins_comments_share_non_western))

    handles = [Rectangle((0,0),1,1, color=c, ec="k") for c in [reviews_color_palette[0], reviews_color_palette[1], reviews_color_palette[2]]]
    ax[1].legend(handles, bins_comments_names_non_western)

    for i in range(0,bin_1_index):
        patches[i].set_facecolor(reviews_color_palette[0])
    for i in range(bin_1_index,bin_2_index):    
        patches[i].set_facecolor(reviews_color_palette[1])
    for i in range(bin_2_index, len(patches)):
        patches[i].set_facecolor(reviews_color_palette[2])

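    fig.tight_layout()
    plt.show()
    # Return the bin edges and names so that later cells (assign_bins,
    # comments_listings_one_plot) can use them; they are local to this function.
    return (bins_comments_western, bins_comments_names_western,
            bins_comments_non_western, bins_comments_names_non_western)
Plot comments penetration distribution by neighbourhoods
In [191]:
(bins_comments_western, bins_comments_names_western,
 bins_comments_non_western, bins_comments_names_non_western) = plot_comments_penetration_distribution_by_neighbourhoods()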
Western comments bins: [11818.55, 59080.75, 236314.0]
Western comments bins names: ['Low', 'Medium', 'High']
Western comments bins share % : [80.26, 14.47, 5.26]
Non-Western comments bins: [13947.8, 69735.0, 278937.0]
Non-Western comments bins names: ['Low', 'Medium', 'High']
Non-Western comments bins share %: [88.38, 9.48, 2.14]
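
The printed edges follow from the 20 equal-width histogram bins: "Low" is the first bin, "Medium" covers bins 2 to 5 and "High" the remaining 15. The sketch below reconstructs the Western edges under one assumption that the notebook does not state directly: the smallest neighbourhood count is 3, which is inferred from the printed edges (bin width = (59080.75 - 11818.55) / 4 = 11815.55, and 11818.55 - 11815.55 = 3). The helper name `equal_width_edges` is hypothetical.

```python
def equal_width_edges(lowest, highest, n_bins=20):
    """Edges of n_bins equal-width bins between lowest and highest."""
    width = (highest - lowest) / n_bins
    return [lowest + k * width for k in range(n_bins + 1)]

# lowest=3 is inferred from the printed output (an assumption), highest is the last printed edge.
edges = equal_width_edges(3, 236314)

# Edges 1, 5 and 20 correspond to bins[bin_1_index], bins[bin_2_index] and bins[-1].
low_edge, medium_edge, high_edge = edges[1], edges[5], edges[-1]
print(round(low_edge, 2), round(medium_edge, 2), round(high_edge, 2))
```

To two decimals these agree with the Western edges printed above; the same reasoning applied to the Non-Western edges gives a bin width of 13946.8 and a smallest count of 1.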

Listings penetration distribution by neighbourhoods

Plot definition
In [188]:
def plot_listings_penetration_distribution_by_neighbourhoods():
    fig, ax = plt.subplots(1, 2, figsize=(10,5))

    N, bins, patches = ax[0].hist(listings_western_df.groupby(['city', 'neighbourhood_cleansed'])['id'].count(), bins=20)
    ax[0].set_title('Listing penetration by neighbourhood in Western countries', pad=5)
    ax[0].set_xlabel('Listing penetration')  
    ax[0].set_ylabel('Count')
    ax[0].set_yscale('log')

    bin_1_index = 1
    bin_2_index = 5
    bins_listings_western = [bins[bin_1_index], bins[bin_2_index], bins[-1]]
    bins_listings_names_western = ['Low', 'Medium', 'High']
    bins_listings_share_western = [percentage(sum(N[0:bin_1_index]), sum(N)), percentage(sum(N[bin_1_index:bin_2_index]), sum(N)), percentage(sum(N[bin_2_index:]), sum(N))]
    print('Western listings bins: {}'.format(bins_listings_western))
    print('Western listings bins names: {}'.format(bins_listings_names_western))
    print('Western listings bins share % : {}'.format(bins_listings_share_western))

    handles = [Rectangle((0,0),1,1, color=c, ec="k") for c in [listings_color_palette[0], listings_color_palette[1], listings_color_palette[2]]]
    ax[0].legend(handles, bins_listings_names_western)

    for i in range(0,bin_1_index):
        patches[i].set_facecolor(listings_color_palette[0])
    for i in range(bin_1_index,bin_2_index):    
        patches[i].set_facecolor(listings_color_palette[1])
    for i in range(bin_2_index, len(patches)):
        patches[i].set_facecolor(listings_color_palette[2])

    N, bins, patches = ax[1].hist(listings_non_western_df.groupby(['city', 'neighbourhood_cleansed'])['id'].count(), bins=20)
    ax[1].set_title('Listing penetration by neighbourhood in Non-Western countries', pad=5)
    ax[1].set_xlabel('Listing penetration')  
    ax[1].set_ylabel('Count')
    ax[1].set_yscale('log')

    bin_1_index = 1
    bin_2_index = 5
    bins_listings_non_western = [bins[bin_1_index], bins[bin_2_index], bins[-1]]
    bins_listings_names_non_western = ['Low', 'Medium', 'High']
    bins_listings_share_non_western = [percentage(sum(N[0:bin_1_index]), sum(N)), percentage(sum(N[bin_1_index:bin_2_index]), sum(N)), percentage(sum(N[bin_2_index:]), sum(N))]
    print('Non-Western listings bins: {}'.format(bins_listings_non_western))
    print('Non-Western listings bins names: {}'.format(bins_listings_names_non_western))
    print('Non-Western listings bins share %: {}'.format(bins_listings_share_non_western))

    handles = [Rectangle((0,0),1,1, color=c, ec="k") for c in [listings_color_palette[0], listings_color_palette[1], listings_color_palette[2]]]
    ax[1].legend(handles, bins_listings_names_non_western)

    for i in range(0,bin_1_index):
        patches[i].set_facecolor(listings_color_palette[0])
    for i in range(bin_1_index,bin_2_index):    
        patches[i].set_facecolor(listings_color_palette[1])
    for i in range(bin_2_index, len(patches)):
        patches[i].set_facecolor(listings_color_palette[2])

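    fig.tight_layout()
    plt.show()
    # Return the bin edges and names so that later cells (assign_bins,
    # comments_listings_one_plot) can use them; they are local to this function.
    return (bins_listings_western, bins_listings_names_western,
            bins_listings_non_western, bins_listings_names_non_western)
Plot listings penetration distribution by neighbourhoods
In [189]:
(bins_listings_western, bins_listings_names_western,
 bins_listings_non_western, bins_listings_names_non_western) = plot_listings_penetration_distribution_by_neighbourhoods()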
Western listings bins: [487.3, 2432.5, 9727.0]
Western listings bins names: ['Low', 'Medium', 'High']
Western listings bins share % : [78.9, 15.91, 5.19]
Non-Western listings bins: [595.6, 2974.0, 11893.0]
Non-Western listings bins names: ['Low', 'Medium', 'High']
Non-Western listings bins share %: [82.78, 13.33, 3.89]

Western countries analysis

Preprocessing

Filter cities from df
In [25]:
london_comments_df_english = filter_city(western_df_english, 'London')
london_comments_df_non_english = filter_city(western_df_non_english, 'London')
london_listings_df = filter_city(listings_western_df, 'London')
london_map_geodf = read_geo_file('neighbourhoods/london.geojson')

new_york_comments_df_english = filter_city(western_df_english, 'New York')
new_york_comments_df_non_english = filter_city(western_df_non_english, 'New York')
new_york_listings_df = filter_city(listings_western_df, 'New York')
new_york_map_geodf = read_geo_file('neighbourhoods/new_york.geojson')

melbourne_comments_df_english = filter_city(western_df_english, 'Melbourne')
melbourne_comments_df_non_english = filter_city(western_df_non_english, 'Melbourne')
melbourne_listings_df = filter_city(listings_western_df, 'Melbourne')
melbourne_map_geodf = read_geo_file('neighbourhoods/melbourne.geojson')

vancouver_comments_df_english = filter_city(western_df_english, 'Vancouver')
vancouver_comments_df_non_english = filter_city(western_df_non_english, 'Vancouver')
vancouver_listings_df = filter_city(listings_western_df, 'Vancouver')
vancouver_map_geodf = read_geo_file('neighbourhoods/vancouver.geojson')
Group comments English
In [135]:
london_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(
                                                        london_map_geodf, group_comments_by_neighbourhood(
                                                        london_comments_df_english, 'comments_english_count'), 
                                                        'comments_english_count')
london_comments_english_geodf = as_geodf(london_comments_df_english)

new_york_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(
                                                        new_york_map_geodf, group_comments_by_neighbourhood(
                                                        new_york_comments_df_english, 'comments_english_count'), 
                                                        'comments_english_count')
new_york_comments_english_geodf = as_geodf(new_york_comments_df_english)

melbourne_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(
                                                        melbourne_map_geodf, group_comments_by_neighbourhood(
                                                        melbourne_comments_df_english, 'comments_english_count'), 
                                                        'comments_english_count')
melbourne_comments_english_geodf = as_geodf(melbourne_comments_df_english)

vancouver_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(
                                                        vancouver_map_geodf, group_comments_by_neighbourhood(
                                                        vancouver_comments_df_english, 'comments_english_count'), 
                                                        'comments_english_count')
vancouver_comments_english_geodf = as_geodf(vancouver_comments_df_english)
Group comments non English
In [137]:
london_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(london_map_geodf, group_comments_by_neighbourhood(london_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
london_comments_non_english_geodf = as_geodf(london_comments_df_non_english)

new_york_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(new_york_map_geodf, group_comments_by_neighbourhood(new_york_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
new_york_comments_non_english_geodf = as_geodf(new_york_comments_df_non_english)

melbourne_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(melbourne_map_geodf, group_comments_by_neighbourhood(melbourne_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
melbourne_comments_non_english_geodf = as_geodf(melbourne_comments_df_non_english)

vancouver_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(vancouver_map_geodf, group_comments_by_neighbourhood(vancouver_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
vancouver_comments_non_english_geodf = as_geodf(vancouver_comments_df_non_english)
Group listings
In [141]:
london_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(london_map_geodf, group_listings_by_neighbourhood(london_listings_df))
london_listings_geodf = as_geodf(london_listings_df)

new_york_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(new_york_map_geodf, group_listings_by_neighbourhood(new_york_listings_df))
new_york_listings_geodf = as_geodf(new_york_listings_df)

melbourne_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(melbourne_map_geodf, group_listings_by_neighbourhood(melbourne_listings_df))
melbourne_listings_geodf = as_geodf(melbourne_listings_df)

vancouver_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(vancouver_map_geodf, group_listings_by_neighbourhood(vancouver_listings_df))
vancouver_listings_geodf = as_geodf(vancouver_listings_df)
Group comments and listings by neighbourhood
In [36]:
london_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(
                                london_comments_df_english, london_comments_df_non_english, london_listings_df)
new_york_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(
                                new_york_comments_df_english, new_york_comments_df_non_english, new_york_listings_df)
melbourne_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(
                                melbourne_comments_df_english, melbourne_comments_df_non_english, melbourne_listings_df)
vancouver_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(
                                vancouver_comments_df_english, vancouver_comments_df_non_english, vancouver_listings_df)
Group comments and listings with map
In [37]:
london_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(
                                                    london_map_geodf, london_comments_listings_by_neighbourhood)
new_york_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(
                                                    new_york_map_geodf, new_york_comments_listings_by_neighbourhood)
melbourne_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(
                                                    melbourne_map_geodf, melbourne_comments_listings_by_neighbourhood)
vancouver_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(
                                                    vancouver_map_geodf, vancouver_comments_listings_by_neighbourhood)
Assign bins
In [139]:
assign_bins(london_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_western, bins_listings_western)
assign_bins(new_york_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_western, bins_listings_western)
assign_bins(melbourne_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_western, bins_listings_western)
assign_bins(vancouver_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_western, bins_listings_western)

Plot comments English

In [136]:
plot_comments_on_map1(
    [[london_map_geodf, london_comments_english_geodf, 'London'], 
     [new_york_map_geodf, new_york_comments_english_geodf, 'New York'],
     [melbourne_map_geodf, melbourne_comments_english_geodf, 'Melbourne'],
     [vancouver_map_geodf, vancouver_comments_english_geodf, 'Vancouver']],
    2, 2, western_df_english)

Plot comments non English

In [138]:
plot_comments_on_map1(
    [[london_map_geodf, london_comments_non_english_geodf, 'London'], 
     [new_york_map_geodf, new_york_comments_non_english_geodf, 'New York'],
     [melbourne_map_geodf, melbourne_comments_non_english_geodf, 'Melbourne'],
     [vancouver_map_geodf, vancouver_comments_non_english_geodf, 'Vancouver']],
    2, 2, western_df_non_english)

Plot listings

In [142]:
plot_listings_on_map1(
    [[london_map_geodf, london_listings_geodf, 'London'], 
     [new_york_map_geodf, new_york_listings_geodf, 'New York'],
     [melbourne_map_geodf, melbourne_listings_geodf, 'Melbourne'],
     [vancouver_map_geodf, vancouver_listings_geodf, 'Vancouver']],
    2, 2, listings_western_df)

Plot comments English by neighbourhood

In [145]:
plot_comments_by_neighbourhood_on_map1(
    [[london_comments_english_by_neighbourhood_on_map_geodf, 'London'], 
     [new_york_comments_english_by_neighbourhood_on_map_geodf, 'New York'],
     [melbourne_comments_english_by_neighbourhood_on_map_geodf, 'Melbourne'],
     [vancouver_comments_english_by_neighbourhood_on_map_geodf, 'Vancouver']], 
    2, 2, western_df_english, 'comments_english_count')

Plot non English comments by neighbourhood

In [146]:
plot_comments_by_neighbourhood_on_map1(
    [[london_comments_non_english_by_neighbourhood_on_map_geodf, 'London'], 
     [new_york_comments_non_english_by_neighbourhood_on_map_geodf, 'New York'],
     [melbourne_comments_non_english_by_neighbourhood_on_map_geodf, 'Melbourne'],
     [vancouver_comments_non_english_by_neighbourhood_on_map_geodf, 'Vancouver']], 
    2, 2, western_df_non_english, 'comments_non_english_count')

Plot listings by neighbourhood

In [149]:
plot_listings_by_neighbourhood_on_map1(
    [[london_listings_by_neighbourhood_on_map_geodf, 'London'], 
     [new_york_listings_by_neighbourhood_on_map_geodf, 'New York'],
     [melbourne_listings_by_neighbourhood_on_map_geodf, 'Melbourne'],
     [vancouver_listings_by_neighbourhood_on_map_geodf, 'Vancouver']], 
    2, 2, listings_western_df)

Plot English, non English comments and listings on one plot

In [140]:
comments_listings_one_plot(london_comments_listings_by_neighbourhood_on_map_geodf, city='London')
In [141]:
comments_listings_one_plot(new_york_comments_listings_by_neighbourhood_on_map_geodf, city='New York')
In [142]:
comments_listings_one_plot(melbourne_comments_listings_by_neighbourhood_on_map_geodf, city='Melbourne')
In [143]:
comments_listings_one_plot(vancouver_comments_listings_by_neighbourhood_on_map_geodf, city='Vancouver')

Non-Western countries analysis

Preprocessing

Filter cities from df
In [145]:
beijing_comments_df_english = filter_city(non_western_df_english, 'Beijing')
beijing_comments_df_non_english = filter_city(non_western_df_non_english, 'Beijing')
beijing_listings_df = filter_city(listings_non_western_df, 'Beijing')
beijing_map_geodf = read_geo_file('neighbourhoods/beijing.geojson')

belize_comments_df_english = filter_city(non_western_df_english, 'Belize')
belize_comments_df_non_english = filter_city(non_western_df_non_english, 'Belize')
belize_listings_df = filter_city(listings_non_western_df, 'Belize')
belize_map_geodf = read_geo_file('neighbourhoods/belize.geojson')

buenos_aires_comments_df_english = filter_city(non_western_df_english, 'Buenos Aires')
buenos_aires_comments_df_non_english = filter_city(non_western_df_non_english, 'Buenos Aires')
buenos_aires_listings_df = filter_city(listings_non_western_df, 'Buenos Aires')
buenos_aires_map_geodf = read_geo_file('neighbourhoods/buenos_aires.geojson')

hong_kong_comments_df_english = filter_city(non_western_df_english, 'Hong Kong')
hong_kong_comments_df_non_english = filter_city(non_western_df_non_english, 'Hong Kong')
hong_kong_listings_df = filter_city(listings_non_western_df, 'Hong Kong')
hong_kong_map_geodf = read_geo_file('neighbourhoods/hong_kong.geojson')

mexico_city_comments_df_english = filter_city(non_western_df_english, 'Mexico City')
mexico_city_comments_df_non_english = filter_city(non_western_df_non_english, 'Mexico City')
mexico_city_listings_df = filter_city(listings_non_western_df, 'Mexico City')
mexico_city_map_geodf = read_geo_file('neighbourhoods/mexico_city.geojson')

rio_de_janeiro_comments_df_english = filter_city(non_western_df_english, 'Rio de Janeiro')
rio_de_janeiro_comments_df_non_english = filter_city(non_western_df_non_english, 'Rio de Janeiro')
rio_de_janeiro_listings_df = filter_city(listings_non_western_df, 'Rio de Janeiro')
rio_de_janeiro_map_geodf = read_geo_file('neighbourhoods/rio_de_janeiro.geojson')

santiago_comments_df_english = filter_city(non_western_df_english, 'Santiago')
santiago_comments_df_non_english = filter_city(non_western_df_non_english, 'Santiago')
santiago_listings_df = filter_city(listings_non_western_df, 'Santiago')
santiago_map_geodf = read_geo_file('neighbourhoods/santiago.geojson')

taipei_comments_df_english = filter_city(non_western_df_english, 'Taipei')
taipei_comments_df_non_english = filter_city(non_western_df_non_english, 'Taipei')
taipei_listings_df = filter_city(listings_non_western_df, 'Taipei')
taipei_map_geodf = read_geo_file('neighbourhoods/taipei.geojson')

tokyo_comments_df_english = filter_city(non_western_df_english, 'Tokyo')
tokyo_comments_df_non_english = filter_city(non_western_df_non_english, 'Tokyo')
tokyo_listings_df = filter_city(listings_non_western_df, 'Tokyo')
tokyo_map_geodf = read_geo_file('neighbourhoods/tokyo.geojson')
Group comments English
In [146]:
beijing_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(beijing_map_geodf, group_comments_by_neighbourhood(beijing_comments_df_english, 'comments_english_count'), 'comments_english_count')
beijing_comments_english_geodf = as_geodf(beijing_comments_df_english)

belize_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(belize_map_geodf, group_comments_by_neighbourhood(belize_comments_df_english, 'comments_english_count'), 'comments_english_count')
belize_comments_english_geodf = as_geodf(belize_comments_df_english)

buenos_aires_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(buenos_aires_map_geodf, group_comments_by_neighbourhood(buenos_aires_comments_df_english, 'comments_english_count'), 'comments_english_count')
buenos_aires_comments_english_geodf = as_geodf(buenos_aires_comments_df_english)

hong_kong_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(hong_kong_map_geodf, group_comments_by_neighbourhood(hong_kong_comments_df_english, 'comments_english_count'), 'comments_english_count')
hong_kong_comments_english_geodf = as_geodf(hong_kong_comments_df_english)

mexico_city_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(mexico_city_map_geodf, group_comments_by_neighbourhood(mexico_city_comments_df_english, 'comments_english_count'), 'comments_english_count')
mexico_city_comments_english_geodf = as_geodf(mexico_city_comments_df_english)

rio_de_janeiro_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(rio_de_janeiro_map_geodf, group_comments_by_neighbourhood(rio_de_janeiro_comments_df_english, 'comments_english_count'), 'comments_english_count')
rio_de_janeiro_comments_english_geodf = as_geodf(rio_de_janeiro_comments_df_english)

santiago_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(santiago_map_geodf, group_comments_by_neighbourhood(santiago_comments_df_english, 'comments_english_count'), 'comments_english_count')
santiago_comments_english_geodf = as_geodf(santiago_comments_df_english)

taipei_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(taipei_map_geodf, group_comments_by_neighbourhood(taipei_comments_df_english, 'comments_english_count'), 'comments_english_count')
taipei_comments_english_geodf = as_geodf(taipei_comments_df_english)

tokyo_comments_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(tokyo_map_geodf, group_comments_by_neighbourhood(tokyo_comments_df_english, 'comments_english_count'), 'comments_english_count')
tokyo_comments_english_geodf = as_geodf(tokyo_comments_df_english)
Group comments non English
In [147]:
beijing_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(beijing_map_geodf, group_comments_by_neighbourhood(beijing_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
beijing_comments_non_english_geodf = as_geodf(beijing_comments_df_non_english)

belize_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(belize_map_geodf, group_comments_by_neighbourhood(belize_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
belize_comments_non_english_geodf = as_geodf(belize_comments_df_non_english)

buenos_aires_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(buenos_aires_map_geodf, group_comments_by_neighbourhood(buenos_aires_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
buenos_aires_comments_non_english_geodf = as_geodf(buenos_aires_comments_df_non_english)

hong_kong_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(hong_kong_map_geodf, group_comments_by_neighbourhood(hong_kong_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
hong_kong_comments_non_english_geodf = as_geodf(hong_kong_comments_df_non_english)

mexico_city_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(mexico_city_map_geodf, group_comments_by_neighbourhood(mexico_city_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
mexico_city_comments_non_english_geodf = as_geodf(mexico_city_comments_df_non_english)

rio_de_janeiro_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(rio_de_janeiro_map_geodf, group_comments_by_neighbourhood(rio_de_janeiro_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
rio_de_janeiro_comments_non_english_geodf = as_geodf(rio_de_janeiro_comments_df_non_english)

santiago_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(santiago_map_geodf, group_comments_by_neighbourhood(santiago_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
santiago_comments_non_english_geodf = as_geodf(santiago_comments_df_non_english)

taipei_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(taipei_map_geodf, group_comments_by_neighbourhood(taipei_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
taipei_comments_non_english_geodf = as_geodf(taipei_comments_df_non_english)

tokyo_comments_non_english_by_neighbourhood_on_map_geodf = join_comments_by_neighbourhood_with_map(tokyo_map_geodf, group_comments_by_neighbourhood(tokyo_comments_df_non_english, 'comments_non_english_count'), 'comments_non_english_count')
tokyo_comments_non_english_geodf = as_geodf(tokyo_comments_df_non_english)
Group listings
In [148]:
beijing_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(beijing_map_geodf, group_listings_by_neighbourhood(beijing_listings_df))
beijing_listings_geodf = as_geodf(beijing_listings_df)

belize_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(belize_map_geodf, group_listings_by_neighbourhood(belize_listings_df))
belize_listings_geodf = as_geodf(belize_listings_df)

buenos_aires_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(buenos_aires_map_geodf, group_listings_by_neighbourhood(buenos_aires_listings_df))
buenos_aires_listings_geodf = as_geodf(buenos_aires_listings_df)

hong_kong_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(hong_kong_map_geodf, group_listings_by_neighbourhood(hong_kong_listings_df))
hong_kong_listings_geodf = as_geodf(hong_kong_listings_df)

mexico_city_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(mexico_city_map_geodf, group_listings_by_neighbourhood(mexico_city_listings_df))
mexico_city_listings_geodf = as_geodf(mexico_city_listings_df)

rio_de_janeiro_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(rio_de_janeiro_map_geodf, group_listings_by_neighbourhood(rio_de_janeiro_listings_df))
rio_de_janeiro_listings_geodf = as_geodf(rio_de_janeiro_listings_df)

santiago_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(santiago_map_geodf, group_listings_by_neighbourhood(santiago_listings_df))
santiago_listings_geodf = as_geodf(santiago_listings_df)

taipei_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(taipei_map_geodf, group_listings_by_neighbourhood(taipei_listings_df))
taipei_listings_geodf = as_geodf(taipei_listings_df)

tokyo_listings_by_neighbourhood_on_map_geodf = join_listings_by_neighbourhood_with_map(tokyo_map_geodf, group_listings_by_neighbourhood(tokyo_listings_df))
tokyo_listings_geodf = as_geodf(tokyo_listings_df)
Group comments and listings by neighbourhood
In [149]:
beijing_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(beijing_comments_df_english, 
                                                            beijing_comments_df_non_english, beijing_listings_df)
belize_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(belize_comments_df_english, 
                                                            belize_comments_df_non_english, belize_listings_df)
buenos_aires_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(buenos_aires_comments_df_english, 
                                                            buenos_aires_comments_df_non_english, buenos_aires_listings_df)
hong_kong_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(hong_kong_comments_df_english, 
                                                            hong_kong_comments_df_non_english, hong_kong_listings_df)
mexico_city_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(mexico_city_comments_df_english, 
                                                            mexico_city_comments_df_non_english, mexico_city_listings_df)
rio_de_janeiro_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(rio_de_janeiro_comments_df_english, 
                                                            rio_de_janeiro_comments_df_non_english, rio_de_janeiro_listings_df)
santiago_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(santiago_comments_df_english, 
                                                            santiago_comments_df_non_english, santiago_listings_df)
taipei_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(taipei_comments_df_english, 
                                                            taipei_comments_df_non_english, taipei_listings_df)
tokyo_comments_listings_by_neighbourhood = merge_calculate_sum_by_neighbourhood(tokyo_comments_df_english, 
                                                            tokyo_comments_df_non_english, tokyo_listings_df)
Merge comments and listings with map
In [150]:
beijing_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(beijing_map_geodf, beijing_comments_listings_by_neighbourhood)
belize_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(belize_map_geodf, belize_comments_listings_by_neighbourhood)
buenos_aires_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(buenos_aires_map_geodf, buenos_aires_comments_listings_by_neighbourhood)
hong_kong_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(hong_kong_map_geodf, hong_kong_comments_listings_by_neighbourhood)
mexico_city_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(mexico_city_map_geodf, mexico_city_comments_listings_by_neighbourhood)
rio_de_janeiro_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(rio_de_janeiro_map_geodf, rio_de_janeiro_comments_listings_by_neighbourhood)
santiago_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(santiago_map_geodf, santiago_comments_listings_by_neighbourhood)
taipei_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(taipei_map_geodf, taipei_comments_listings_by_neighbourhood)
tokyo_comments_listings_by_neighbourhood_on_map_geodf = join_comments_and_listings_by_neighbourhood_with_map(tokyo_map_geodf, tokyo_comments_listings_by_neighbourhood)
Assign bins
In [151]:
assign_bins(beijing_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(belize_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(buenos_aires_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(hong_kong_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(mexico_city_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(rio_de_janeiro_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(santiago_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(taipei_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
assign_bins(tokyo_comments_listings_by_neighbourhood_on_map_geodf, bins_comments_non_western, bins_listings_non_western)
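
The per-city cells above repeat the same calls with only the city name changing. A minimal sketch of how the file paths could be derived from the city names instead of being typed out, assuming the `neighbourhoods/<city>.geojson` naming seen above holds for every city; `city_slug`, `map_path` and `map_paths` are hypothetical names that do not appear in the notebook.

```python
# Hypothetical helpers: derive the geojson path from the display name of a city.
def city_slug(city):
    """'Rio de Janeiro' -> 'rio_de_janeiro' (lower case, spaces to underscores)."""
    return city.lower().replace(' ', '_')


def map_path(city):
    """Build the path in the same pattern the notebook passes to read_geo_file."""
    return 'neighbourhoods/{}.geojson'.format(city_slug(city))


non_western_cities = ['Beijing', 'Belize', 'Buenos Aires', 'Hong Kong', 'Mexico City',
                      'Rio de Janeiro', 'Santiago', 'Taipei', 'Tokyo']

# One dictionary keyed by city replaces nine hand-written path strings.
map_paths = {city: map_path(city) for city in non_western_cities}

for city, path in map_paths.items():
    print(city, '->', path)
```

In the notebook, a loop over `non_western_cities` could then call `filter_city(df, city)` and `read_geo_file(map_paths[city])` and store the results in dictionaries, at the cost of replacing the per-city variable names used in the plotting cells.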

Plot English comments

In [156]:
plot_comments_on_map1(
    [[beijing_map_geodf, beijing_comments_english_geodf, 'Beijing'], 
     [belize_map_geodf, belize_comments_english_geodf, 'Belize'],
     [buenos_aires_map_geodf, buenos_aires_comments_english_geodf, 'Buenos Aires'],
     [hong_kong_map_geodf, hong_kong_comments_english_geodf, 'Hong Kong'], 
     [mexico_city_map_geodf, mexico_city_comments_english_geodf, 'Mexico City'],
     [rio_de_janeiro_map_geodf, rio_de_janeiro_comments_english_geodf, 'Rio de Janeiro'],
     [santiago_map_geodf, santiago_comments_english_geodf, 'Santiago'], 
     [taipei_map_geodf, taipei_comments_english_geodf, 'Taipei'],
     [tokyo_map_geodf, tokyo_comments_english_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, non_western_df_english)

Plot non English comments

In [157]:
plot_comments_on_map1(
    [[beijing_map_geodf, beijing_comments_non_english_geodf, 'Beijing'], 
     [belize_map_geodf, belize_comments_non_english_geodf, 'Belize'],
     [buenos_aires_map_geodf, buenos_aires_comments_non_english_geodf, 'Buenos Aires'],
     [hong_kong_map_geodf, hong_kong_comments_non_english_geodf, 'Hong Kong'], 
     [mexico_city_map_geodf, mexico_city_comments_non_english_geodf, 'Mexico City'],
     [rio_de_janeiro_map_geodf, rio_de_janeiro_comments_non_english_geodf, 'Rio de Janeiro'],
     [santiago_map_geodf, santiago_comments_non_english_geodf, 'Santiago'], 
     [taipei_map_geodf, taipei_comments_non_english_geodf, 'Taipei'],
     [tokyo_map_geodf, tokyo_comments_non_english_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, non_western_df_non_english)

Plot listings

In [158]:
plot_listings_on_map1(
    [[beijing_map_geodf, beijing_listings_geodf, 'Beijing'], 
     [belize_map_geodf, belize_listings_geodf, 'Belize'],
     [buenos_aires_map_geodf, buenos_aires_listings_geodf, 'Buenos Aires'],
     [hong_kong_map_geodf, hong_kong_listings_geodf, 'Hong Kong'], 
     [mexico_city_map_geodf, mexico_city_listings_geodf, 'Mexico City'],
     [rio_de_janeiro_map_geodf, rio_de_janeiro_listings_geodf, 'Rio de Janeiro'],
     [santiago_map_geodf, santiago_listings_geodf, 'Santiago'], 
     [taipei_map_geodf, taipei_listings_geodf, 'Taipei'],
     [tokyo_map_geodf, tokyo_listings_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, listings_non_western_df)

Plot English comments by neighbourhood

In [159]:
plot_comments_by_neighbourhood_on_map1(
    [[beijing_comments_english_by_neighbourhood_on_map_geodf, 'Beijing'], 
     [belize_comments_english_by_neighbourhood_on_map_geodf, 'Belize'],
     [buenos_aires_comments_english_by_neighbourhood_on_map_geodf, 'Buenos Aires'],
     [hong_kong_comments_english_by_neighbourhood_on_map_geodf, 'Hong Kong'], 
     [mexico_city_comments_english_by_neighbourhood_on_map_geodf, 'Mexico City'],
     [rio_de_janeiro_comments_english_by_neighbourhood_on_map_geodf, 'Rio de Janeiro'],
     [santiago_comments_english_by_neighbourhood_on_map_geodf, 'Santiago'], 
     [taipei_comments_english_by_neighbourhood_on_map_geodf, 'Taipei'],
     [tokyo_comments_english_by_neighbourhood_on_map_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, non_western_df_english, 'comments_english_count')

Plot non English comments by neighbourhood

In [160]:
plot_comments_by_neighbourhood_on_map1(
    [[beijing_comments_non_english_by_neighbourhood_on_map_geodf, 'Beijing'], 
     [belize_comments_non_english_by_neighbourhood_on_map_geodf, 'Belize'],
     [buenos_aires_comments_non_english_by_neighbourhood_on_map_geodf, 'Buenos Aires'],
     [hong_kong_comments_non_english_by_neighbourhood_on_map_geodf, 'Hong Kong'], 
     [mexico_city_comments_non_english_by_neighbourhood_on_map_geodf, 'Mexico City'],
     [rio_de_janeiro_comments_non_english_by_neighbourhood_on_map_geodf, 'Rio de Janeiro'],
     [santiago_comments_non_english_by_neighbourhood_on_map_geodf, 'Santiago'], 
     [taipei_comments_non_english_by_neighbourhood_on_map_geodf, 'Taipei'],
     [tokyo_comments_non_english_by_neighbourhood_on_map_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, non_western_df_non_english, 'comments_non_english_count')

Plot listings by neighbourhood

In [161]:
plot_listings_by_neighbourhood_on_map1(
    [[beijing_listings_by_neighbourhood_on_map_geodf, 'Beijing'], 
     [belize_listings_by_neighbourhood_on_map_geodf, 'Belize'],
     [buenos_aires_listings_by_neighbourhood_on_map_geodf, 'Buenos Aires'],
     [hong_kong_listings_by_neighbourhood_on_map_geodf, 'Hong Kong'], 
     [mexico_city_listings_by_neighbourhood_on_map_geodf, 'Mexico City'],
     [rio_de_janeiro_listings_by_neighbourhood_on_map_geodf, 'Rio de Janeiro'],
     [santiago_listings_by_neighbourhood_on_map_geodf, 'Santiago'], 
     [taipei_listings_by_neighbourhood_on_map_geodf, 'Taipei'],
     [tokyo_listings_by_neighbourhood_on_map_geodf, 'Tokyo', (138.8,140), (35.4,36)]],
    3, 3, listings_non_western_df)

Plot comments and listings on one plot

In [152]:
comments_listings_one_plot(beijing_comments_listings_by_neighbourhood_on_map_geodf, city='Beijing')
In [153]:
comments_listings_one_plot(belize_comments_listings_by_neighbourhood_on_map_geodf, city='Belize')
In [154]:
comments_listings_one_plot(buenos_aires_comments_listings_by_neighbourhood_on_map_geodf, city='Buenos Aires')
In [155]:
comments_listings_one_plot(hong_kong_comments_listings_by_neighbourhood_on_map_geodf, city='Hong Kong')
In [156]:
comments_listings_one_plot(mexico_city_comments_listings_by_neighbourhood_on_map_geodf, city='Mexico City')
In [157]:
comments_listings_one_plot(rio_de_janeiro_comments_listings_by_neighbourhood_on_map_geodf, city='Rio de Janeiro')
In [158]:
comments_listings_one_plot(santiago_comments_listings_by_neighbourhood_on_map_geodf, city='Santiago')
In [159]:
comments_listings_one_plot(taipei_comments_listings_by_neighbourhood_on_map_geodf, city='Taipei')
In [160]:
comments_listings_one_plot(tokyo_comments_listings_by_neighbourhood_on_map_geodf, xlim=(138.8,140), ylim=(35.4,36), city='Tokyo')

Pearson Correlation between comments and listings penetration by neighbourhoods

Preprocessing

Merge comments and listings to one dataframe
In [87]:
western_comments_listings_by_neighbourhood = pd.DataFrame.merge(western_comments.groupby(
                                            ['city','neighbourhood_cleansed'])['comments'].count().reset_index()\
                                            .rename(columns={'comments':'comments_penetration'}),
                                            listings_western_df.groupby(['city','neighbourhood_cleansed'])['id']\
                                            .count().reset_index().rename(columns={'id':'listings_penetration'}),
                                            left_on=['city','neighbourhood_cleansed'], 
                                            right_on=['city','neighbourhood_cleansed'], how='inner')
western_comments_listings_by_neighbourhood['type'] = 'Western'
# Normalise each count by the largest neighbourhood count in the Western set (vectorised).
western_comments_listings_by_neighbourhood['listings_penetration_rate'] = (
    western_comments_listings_by_neighbourhood.listings_penetration
    / western_comments_listings_by_neighbourhood.listings_penetration.max())
western_comments_listings_by_neighbourhood['comments_penetration_rate'] = (
    western_comments_listings_by_neighbourhood.comments_penetration
    / western_comments_listings_by_neighbourhood.comments_penetration.max())
western_comments_listings_by_neighbourhood.head()
Out[87]:
     city  neighbourhood_cleansed  comments_penetration  listings_penetration     type  listings_penetration_rate  comments_penetration_rate
0  London    Barking and Dagenham                  2981                   396  Western                   0.040711                   0.012615
1  London                  Barnet                 18720                  1688  Western                   0.173538                   0.079217
2  London                  Bexley                  1932                   269  Western                   0.027655                   0.008176
3  London                   Brent                 46024                  2608  Western                   0.268120                   0.194758
4  London                 Bromley                  7382                   649  Western                   0.066721                   0.031238
In [88]:
non_western_comments_listings_by_neighbourhood = pd.DataFrame.merge(non_western_comments.groupby(
                                            ['city','neighbourhood_cleansed'])['comments'].count().reset_index()\
                                            .rename(columns={'comments':'comments_penetration'}),
                                            listings_non_western_df.groupby(['city','neighbourhood_cleansed'])['id']\
                                            .count().reset_index().rename(columns={'id':'listings_penetration'}),
                                            left_on=['city','neighbourhood_cleansed'], 
                                            right_on=['city','neighbourhood_cleansed'], how='inner')
non_western_comments_listings_by_neighbourhood['type'] = 'Non Western'
# Normalise each count by the largest neighbourhood count in the non-Western set (vectorised).
non_western_comments_listings_by_neighbourhood['listings_penetration_rate'] = (
    non_western_comments_listings_by_neighbourhood.listings_penetration
    / non_western_comments_listings_by_neighbourhood.listings_penetration.max())
non_western_comments_listings_by_neighbourhood['comments_penetration_rate'] = (
    non_western_comments_listings_by_neighbourhood.comments_penetration
    / non_western_comments_listings_by_neighbourhood.comments_penetration.max())
non_western_comments_listings_by_neighbourhood.head()
Out[88]:
      city  neighbourhood_cleansed  comments_penetration  listings_penetration         type  listings_penetration_rate  comments_penetration_rate
0  Beijing                     东城区                 30521                  3734  Non Western                   0.313966                   0.109419
1  Beijing           丰台区 / Fengtai                 10713                  2464  Non Western                   0.207181                   0.038407
2  Beijing            大兴区 / Daxing                  3281                  1245  Non Western                   0.104683                   0.011763
3  Beijing             密云县 / Miyun                  3389                  1918  Non Western                   0.161271                   0.012150
4  Beijing            平谷区 / Pinggu                   238                   242  Non Western                   0.020348                   0.000853
Merge Western and Non Western sets to one data frame
In [89]:
all_comments_listings_by_neighbourhood = pd.concat([western_comments_listings_by_neighbourhood, 
                                                    non_western_comments_listings_by_neighbourhood]).reset_index(drop=True)
all_comments_listings_by_neighbourhood.head()
Out[89]:
     city  neighbourhood_cleansed  comments_penetration  listings_penetration     type  listings_penetration_rate  comments_penetration_rate
0  London    Barking and Dagenham                  2981                   396  Western                   0.040711                   0.012615
1  London                  Barnet                 18720                  1688  Western                   0.173538                   0.079217
2  London                  Bexley                  1932                   269  Western                   0.027655                   0.008176
3  London                   Brent                 46024                  2608  Western                   0.268120                   0.194758
4  London                 Bromley                  7382                   649  Western                   0.066721                   0.031238
In [90]:
all_comments_listings_by_neighbourhood.listings_penetration.describe()
Out[90]:
count      631.000000
mean       533.042789
std       1266.694319
min          1.000000
25%         15.000000
50%         85.000000
75%        394.000000
max      11893.000000
Name: listings_penetration, dtype: float64
In [91]:
all_comments_listings_by_neighbourhood.comments_penetration.describe()
Out[91]:
count       631.000000
mean       9079.698891
std       24450.439623
min           1.000000
25%         160.000000
50%        1032.000000
75%        6403.000000
max      278937.000000
Name: comments_penetration, dtype: float64

Scatter plot with Pearson Correlation between comments and listings penetration by neighbourhoods

In [98]:
def annotate(data, **kws):
    """Annotate the current facet with the Pearson coefficient and its p-value."""
    axes = plt.gca()
    # pearsonr returns the correlation coefficient and the two-sided p-value
    r, p = stats.pearsonr(data.comments_penetration, data.listings_penetration)
    text = 'R: {:.2g}, p = {:.2g}'.format(r, p)

    axes.annotate(text, xy=(0.05, 0.9), xycoords='axes fraction', ha='left', fontstyle='italic')

g = sns.FacetGrid(all_comments_listings_by_neighbourhood, col='type', col_wrap=5, height=3)
g.map_dataframe(sns.regplot, x='comments_penetration', y='listings_penetration')
g.map_dataframe(annotate)
g.set_titles(col_template='{col_name}')
g.set_axis_labels('Comments penetration', 'Listings penetration')

g.fig.tight_layout(pad=1)
plt.show()
In [100]:
g = sns.FacetGrid(all_comments_listings_by_neighbourhood[all_comments_listings_by_neighbourhood.type=='Non Western'],
                  col='city', col_wrap=5, height=3)
g.map_dataframe(sns.regplot, x='comments_penetration', y='listings_penetration')
g.map_dataframe(annotate)
g.set_titles(col_template='{col_name}')
g.set_axis_labels('Comments penetration', 'Listings penetration')

g.fig.tight_layout(pad=2)
plt.show()
In [99]:
g = sns.FacetGrid(all_comments_listings_by_neighbourhood[all_comments_listings_by_neighbourhood.type=='Western'], 
                  col='city', height=3)
g.map_dataframe(sns.regplot, x='comments_penetration', y='listings_penetration')
g.map_dataframe(annotate)
g.set_titles(col_template='{col_name}')
g.set_axis_labels('Comments penetration', 'Listings penetration')

g.fig.tight_layout(pad=2)
plt.show()
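
The R value annotated on each panel is Pearson's correlation coefficient: the covariance of the two series divided by the product of their standard deviations. A dependency-free sketch of that computation on toy values (not taken from the dataset) may make the annotation easier to read; `pearson_r` is a hypothetical helper, and unlike `stats.pearsonr` it does not return a p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: sum of co-deviations over the root of the product of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy neighbourhood counts, made up for illustration only.
comments = [10, 20, 30, 40]
listings = [1, 2, 3, 4]

r_linear = pearson_r(comments, listings)         # listings grow in step with comments
r_inverse = pearson_r(comments, listings[::-1])  # listings fall as comments grow
print(r_linear, r_inverse)
```

Because the coefficient only measures linear association, a few very large neighbourhoods can dominate it; the `describe()` outputs above show maxima far above the 75th percentile, which is worth keeping in mind when reading the panels.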

Gini index for Airbnb adoption by neighbourhood

Gini definition

In [163]:
def gini(array):
    """Calculate the Gini coefficient of a numpy array.

    Note: the next cell redefines `gini`, so the later cells use that version, not this one.
    """
    # based on bottom eq: http://www.statsdirect.com/help/content/image/stat0206_wmf.gif
    # from: http://www.statsdirect.com/help/default.htm#nonparametric_methods/gini.htm
    array = array.flatten()  # all values are treated equally; flatten() returns a 1-D copy
    if np.amin(array) < 0:
        array -= np.amin(array)  # shift so that no value is negative
    array += 1  # shift so that no value is 0 (note: this offset slightly changes the result)
    array = np.sort(array)  # values must be sorted
    n = array.shape[0]  # number of array elements
    index = np.arange(1, n + 1)  # 1-based rank of each element
    return np.sum((2 * index - n - 1) * array) / (n * np.sum(array))  # Gini coefficient
In [164]:
def gini(x, w=None):
    # The rest of the code requires numpy arrays.
    x = np.asarray(x)
    if w is not None:
        w = np.asarray(w)
        sorted_indices = np.argsort(x)
        sorted_x = x[sorted_indices]
        sorted_w = w[sorted_indices]
        # Force float dtype to avoid overflows
        cumw = np.cumsum(sorted_w, dtype=float)
        cumxw = np.cumsum(sorted_x * sorted_w, dtype=float)
        return (np.sum(cumxw[1:] * cumw[:-1] - cumxw[:-1] * cumw[1:]) / 
                (cumxw[-1] * cumw[-1]))
    else:
        sorted_x = np.sort(x)
        n = len(x)
        cumx = np.cumsum(sorted_x, dtype=float)
        # The above formula, with all weights equal to 1 simplifies to:
        return (n + 1 - 2 * np.sum(cumx) / cumx[-1]) / n
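
As a quick sanity check of the unweighted branch above, the same closed form can be evaluated in plain Python on toy inputs. This is only a sketch: `gini_unweighted` is a hypothetical name, and the input lists are made up rather than taken from the data.

```python
from itertools import accumulate


def gini_unweighted(values):
    """Unweighted Gini: (n + 1 - 2 * sum(cumx) / cumx[-1]) / n on the sorted values."""
    sorted_x = sorted(values)
    n = len(sorted_x)
    cumx = list(accumulate(sorted_x))  # running totals of the sorted values
    return (n + 1 - 2 * sum(cumx) / cumx[-1]) / n


# Every neighbourhood has the same number of comments: perfect equality.
equal = gini_unweighted([5, 5, 5, 5])

# One neighbourhood out of four holds all comments: maximal concentration for n = 4.
concentrated = gini_unweighted([0, 0, 0, 10])

print(equal, concentrated)
```

With this formula the largest attainable value for n neighbourhoods is (n - 1) / n, so a city with few neighbourhoods cannot reach 1 even when all activity sits in a single one. That may matter when comparing cities with very different numbers of neighbourhoods.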

Gini index by city

Preprocessing
Comments grouped by city and neighbourhoods English
In [165]:
western_neighbourhoods_grouped_by_city_english = western_df_english.groupby(
    ['city', 'neighbourhood_cleansed'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [166]:
non_western_neighbourhoods_grouped_by_city_english = non_western_df_english.groupby(['city', 'neighbourhood_cleansed'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
Comments grouped by city and neighbourhood non English
In [167]:
western_neighbourhoods_grouped_by_city_non_english = western_df_non_english.groupby(['city', 'neighbourhood_cleansed'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [168]:
non_western_neighbourhoods_grouped_by_city_non_english = non_western_df_non_english.groupby(['city', 'neighbourhood_cleansed'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
Gini for comments English
In [169]:
western_gini_by_city_english = western_neighbourhoods_grouped_by_city_english.groupby(
    ['city'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
In [170]:
non_western_gini_by_city_english = non_western_neighbourhoods_grouped_by_city_english.groupby(
    ['city'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
Gini for comments non English
In [171]:
western_gini_by_city_non_english = western_neighbourhoods_grouped_by_city_non_english.groupby(
    ['city'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
In [172]:
non_western_gini_by_city_non_english = non_western_neighbourhoods_grouped_by_city_non_english.groupby(
    ['city'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
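The eight cells above repeat the same two steps: count comments per unit within a city, then apply `gini` to the counts of each city. A minimal sketch of how this could be wrapped in one function is shown below. The name `gini_by_group`, its parameters, and the toy data are assumptions made for illustration; the column names `city`, `neighbourhood_cleansed` and `comments` are taken from the notebook.

```python
import numpy as np
import pandas as pd


def gini(x):
    """Unweighted branch of the notebook's gini(), repeated for self-containment."""
    sorted_x = np.sort(np.asarray(x, dtype=float))
    n = len(sorted_x)
    cumx = np.cumsum(sorted_x)
    return (n + 1 - 2 * np.sum(cumx) / cumx[-1]) / n


def gini_by_group(df, unit_col, group_cols=("city",), value_col="comments"):
    """Hypothetical helper: count non-null comments per unit, then Gini per group."""
    group_cols = list(group_cols)
    # Step 1: number of comments per (group, unit), e.g. per (city, neighbourhood)
    counts = df.groupby(group_cols + [unit_col])[value_col].count().reset_index(name="n")
    # Step 2: Gini of those counts within each group
    return (counts.groupby(group_cols)["n"]
                  .apply(lambda s: gini(s.to_numpy()))
                  .reset_index(name="gini"))


# Toy data: city A is uneven (3 vs 1 comments), city B is even (2 vs 2)
toy = pd.DataFrame({
    "city": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "neighbourhood_cleansed": ["n1", "n1", "n1", "n2", "n1", "n1", "n2", "n2"],
    "comments": ["ok"] * 8,
})
result = gini_by_group(toy, unit_col="neighbourhood_cleansed")
```

Under these assumptions, passing `unit_col="host_id"` or `group_cols=("city", "year")` would cover the host and yearly variants that follow.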
Plot Gini index for English comments by city
In [173]:
fig, ax = plt.subplots(1, 2, figsize=(12, 5), sharey=True)

ax1 = western_gini_by_city_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[0])
ax1.axhline(western_gini_by_city_english.gini.mean(), color='r', linestyle='dashed', label='Western mean')
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_gini_by_city_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[1])
ax2.axhline(non_western_gini_by_city_english.gini.mean(), color='r', linestyle='dashed', label='non Western mean')
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries')
ax2.legend(loc='best')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();
Plot Gini index for non English comments by city
In [174]:
fig, ax = plt.subplots(1, 2, figsize=(12, 5), sharey=True)

ax1 = western_gini_by_city_non_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[0])
ax1.axhline(western_gini_by_city_non_english.gini.mean(), color='r', linestyle='dashed', label='Western mean')
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_gini_by_city_non_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[1])
ax2.axhline(non_western_gini_by_city_non_english.gini.mean(), color='r', linestyle='dashed', label='non Western mean')
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries')
ax2.legend(loc='best')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();

Gini index by city and year

Preprocessing
Extract review year
In [108]:
western_df_english['year'] = pd.DatetimeIndex(western_df_english['date']).year
non_western_df_english['year'] = pd.DatetimeIndex(non_western_df_english['date']).year
western_df_non_english['year'] = pd.DatetimeIndex(western_df_non_english['date']).year
non_western_df_non_english['year'] = pd.DatetimeIndex(non_western_df_non_english['date']).year
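An equivalent way to derive the year, shown here as a minimal sketch on made-up dates, uses `pd.to_datetime` with the `.dt` accessor and `assign`, which returns a new frame instead of writing into an existing one. This is only an alternative under the assumption that the four frames may be filtered views of a larger frame; the `reviews` frame below is hypothetical.

```python
import pandas as pd

# Toy frame standing in for one of the review frames; 'date' holds ISO date strings
reviews = pd.DataFrame({"date": ["2015-03-01", "2016-07-15", "2016-12-31"]})

# Parse the dates once and keep only the calendar year
reviews = reviews.assign(year=pd.to_datetime(reviews["date"]).dt.year)

# Same values as the DatetimeIndex form used in the notebook
same_as_notebook = list(pd.DatetimeIndex(reviews["date"]).year) == reviews["year"].tolist()
```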
Comments grouped by city by year English
In [119]:
western_neighbourhoods_grouped_by_city_by_year_english = western_df_english.groupby(
    ['city', 'neighbourhood_cleansed', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [120]:
non_western_neighbourhoods_grouped_by_city_by_year_english = non_western_df_english.groupby(
    ['city', 'neighbourhood_cleansed', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [121]:
western_gini_by_city_by_year_english = western_neighbourhoods_grouped_by_city_by_year_english.groupby(
    ['city', 'year'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
In [122]:
non_western_gini_by_city_by_year_english = non_western_neighbourhoods_grouped_by_city_by_year_english.groupby(
    ['city', 'year'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
Comments grouped by city by year non English
In [123]:
western_neighbourhoods_grouped_by_city_by_year_non_english = western_df_non_english.groupby(
    ['city', 'neighbourhood_cleansed', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [124]:
non_western_neighbourhoods_grouped_by_city_by_year_non_english = non_western_df_non_english.groupby(
    ['city', 'neighbourhood_cleansed', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [125]:
western_gini_by_city_by_year_non_english = western_neighbourhoods_grouped_by_city_by_year_non_english.groupby(
    ['city', 'year'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
In [126]:
non_western_gini_by_city_by_year_non_english = non_western_neighbourhoods_grouped_by_city_by_year_non_english.groupby(
    ['city', 'year'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
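One thing to keep in mind with the yearly grouping: `groupby(...).count()` only returns (city, neighbourhood, year) combinations that have at least one review, so a neighbourhood with no reviews in a given year is left out rather than counted as zero. The sketch below, on made-up data, shows how zero-filling could change the yearly Gini. It illustrates a possible refinement, not what the notebook does; the toy frame and the names `observed`, `filled` and `zero_filled` are hypothetical.

```python
import numpy as np
import pandas as pd


def gini(x):
    """Unweighted branch of the notebook's gini(), repeated for self-containment."""
    sorted_x = np.sort(np.asarray(x, dtype=float))
    n = len(sorted_x)
    cumx = np.cumsum(sorted_x)
    return (n + 1 - 2 * np.sum(cumx) / cumx[-1]) / n


# Toy data: neighbourhood n3 only receives reviews from 2016 onwards
reviews = pd.DataFrame({
    "city": ["A"] * 7,
    "neighbourhood_cleansed": ["n1", "n1", "n2", "n2", "n1", "n2", "n3"],
    "year": [2015, 2015, 2015, 2015, 2016, 2016, 2016],
    "comments": ["ok"] * 7,
})

counts = reviews.groupby(["city", "neighbourhood_cleansed", "year"])["comments"].count()

# As in the notebook: only neighbourhoods with at least one review in that year
observed = {key: gini(s.to_numpy())
            for key, s in counts.groupby(level=["city", "year"])}

# Zero-filled variant: every neighbourhood seen in the city counts in every year
filled = counts.unstack("year", fill_value=0)
zero_filled = {(city, year): gini(group[year].to_numpy())
               for city, group in filled.groupby(level="city")
               for year in filled.columns}
```

In this toy case the 2015 Gini for city A is 0 when n3 is ignored and 1/3 when n3 is counted with zero reviews, so early years with few active neighbourhoods may look more equal than they are.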
Plot Gini for English reviews by city and year
In [169]:
fig, ax = plt.subplots(2, 1, figsize=(10, 7), sharey=True, sharex=True)

ax1 = western_gini_by_city_by_year_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)

ax2 = non_western_gini_by_city_by_year_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.legend(loc='best')
ax2.yaxis.grid(True)

plt.tight_layout()
plt.show();
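The `pivot` step used for these plots reshapes the long (city, year, gini) table into a wide one with one row per year and one column per city, which is the layout `DataFrame.plot` needs to draw one line per city. A minimal sketch on made-up values follows; the cities `A` and `B` and their numbers are hypothetical. Passing `index`, `columns` and `values` by keyword works across pandas versions, whereas positional use was deprecated and then removed in pandas 2.x.

```python
import pandas as pd

# Long format, as produced by the groupby/apply cells (values are made up)
long_table = pd.DataFrame({
    "city": ["A", "A", "B"],
    "year": [2015, 2016, 2016],
    "gini": [0.40, 0.45, 0.30],
})

# Wide format: rows are years, columns are cities
wide = long_table.pivot(index="year", columns="city", values="gini")

# City B has no 2015 entry, so that cell is NaN and its line would start in 2016
missing_b_2015 = pd.isna(wide.loc[2015, "B"])
```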
Plot Gini index for non English reviews by city and year
In [168]:
fig, ax = plt.subplots(2, 1, figsize=(10, 7), sharey=True, sharex=True)

ax1 = western_gini_by_city_by_year_non_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)

ax2 = non_western_gini_by_city_by_year_non_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.legend(loc='best')
ax2.yaxis.grid(True)

plt.tight_layout()
plt.show();
Plot Gini index by city and year English (bar chart)
In [134]:
fig, ax = plt.subplots(2, 1, figsize=(10, 10), sharey=True)
ax1 = western_gini_by_city_by_year_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_gini_by_city_by_year_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();
Plot Gini index by city and year non English (bar chart)
In [137]:
fig, ax = plt.subplots(2, 1, figsize=(10, 10), sharey=True)
ax1 = western_gini_by_city_by_year_non_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by neighbourhood in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_gini_by_city_by_year_non_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by neighbourhood in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();

Host adoption analysis

Gini index by city

Preprocessing

English comments
In [177]:
western_host_grouped_by_city_english = western_df_english.groupby(
    ['city', 'host_id'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [178]:
western_host_gini_by_city_english = western_host_grouped_by_city_english.groupby(
    ['city'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
In [179]:
non_western_host_grouped_by_city_english = non_western_df_english.groupby(
    ['city', 'host_id'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [180]:
non_western_host_gini_by_city_english = non_western_host_grouped_by_city_english.groupby(
    ['city'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
Non English comments
In [181]:
western_host_grouped_by_city_non_english = western_df_non_english.groupby(
    ['city', 'host_id'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [182]:
western_host_gini_by_city_non_english = western_host_grouped_by_city_non_english.groupby(
    ['city'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
In [183]:
non_western_host_grouped_by_city_non_english = non_western_df_non_english.groupby(
    ['city', 'host_id'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [184]:
non_western_host_gini_by_city_non_english = non_western_host_grouped_by_city_non_english.groupby(
    ['city'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
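The `gini` function defined earlier also accepts weights `w`, which the cells above do not use. A minimal sketch of one possible use at host level: many hosts share the same review count, so the counts can be collapsed to distinct values with their frequencies as weights, giving the same coefficient from a much shorter array. The host counts below are made up, and this is an illustration of the `w` parameter rather than a step of the original analysis.

```python
import numpy as np
import pandas as pd


def gini(x, w=None):
    """Copy of the notebook's gini(), repeated for self-containment."""
    x = np.asarray(x)
    if w is not None:
        w = np.asarray(w)
        sorted_indices = np.argsort(x)
        sorted_x = x[sorted_indices]
        sorted_w = w[sorted_indices]
        cumw = np.cumsum(sorted_w, dtype=float)
        cumxw = np.cumsum(sorted_x * sorted_w, dtype=float)
        return (np.sum(cumxw[1:] * cumw[:-1] - cumxw[:-1] * cumw[1:]) /
                (cumxw[-1] * cumw[-1]))
    sorted_x = np.sort(x)
    n = len(x)
    cumx = np.cumsum(sorted_x, dtype=float)
    return (n + 1 - 2 * np.sum(cumx) / cumx[-1]) / n


# Made-up reviews-per-host counts for one city
host_counts = pd.Series([1, 1, 1, 3, 3, 7])

# Distinct count values (index) and how many hosts have each (values)
freq = host_counts.value_counts()

g_full = gini(host_counts.to_numpy())
g_weighted = gini(freq.index.to_numpy(), w=freq.to_numpy())
```

Both calls return 38/96 (about 0.396) here, since frequency weights describe the same Lorenz curve as the expanded array.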

Plot Gini index by city English

In [185]:
fig, ax = plt.subplots(1, 2, figsize=(10, 5), sharey=True)

ax1 = western_host_gini_by_city_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[0])
ax1.axhline(western_host_gini_by_city_english.gini.mean(), color='r', linestyle='dashed', label='Western mean')
ax1.set_title('Gini index for Airbnb adoption by host in Western countries')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_host_gini_by_city_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[1])
ax2.axhline(non_western_host_gini_by_city_english.gini.mean(), color='r', linestyle='dashed', label='non Western mean')
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries')
ax2.legend(loc='best')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();

Plot Gini index by city non English

In [186]:
fig, ax = plt.subplots(1, 2, figsize=(10, 5), sharey=True)

ax1 = western_host_gini_by_city_non_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[0])
ax1.axhline(western_host_gini_by_city_non_english.gini.mean(), color='r', linestyle='dashed', label='Western mean')
ax1.set_title('Gini index for Airbnb adoption by host in Western countries')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_host_gini_by_city_non_english.sort_values(by='gini', ascending=False).plot(x = 'city', kind='bar', ax=ax[1])
ax2.axhline(non_western_host_gini_by_city_non_english.gini.mean(), color='r', linestyle='dashed', label='non Western mean')
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries')
ax2.legend(loc='best')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout()
plt.show();

Gini index by city and year

Preprocessing

by year English
In [154]:
western_host_grouped_by_city_by_year_english = western_df_english.groupby(
    ['city', 'host_id', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [155]:
western_host_gini_by_city_by_year_english = western_host_grouped_by_city_by_year_english.groupby(
    ['city', 'year'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
In [156]:
non_western_host_grouped_by_city_by_year_english = non_western_df_english.groupby(
    ['city', 'host_id', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_english'})
In [157]:
non_western_host_gini_by_city_by_year_english = non_western_host_grouped_by_city_by_year_english.groupby(
    ['city', 'year'])['comments_count_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_english':'gini'})
by year non English
In [158]:
western_host_grouped_by_city_by_year_non_english = western_df_non_english.groupby(
    ['city', 'host_id', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [159]:
western_host_gini_by_city_by_year_non_english = western_host_grouped_by_city_by_year_non_english.groupby(
    ['city', 'year'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})
In [160]:
non_western_host_grouped_by_city_by_year_non_english = non_western_df_non_english.groupby(
    ['city', 'host_id', 'year'])['comments'].count().reset_index().rename(columns={'comments':'comments_count_non_english'})
In [161]:
non_western_host_gini_by_city_by_year_non_english = non_western_host_grouped_by_city_by_year_non_english.groupby(
    ['city', 'year'])['comments_count_non_english'].apply(lambda x: gini(x.values)).reset_index().rename(columns={'comments_count_non_english':'gini'})

Plot Gini index by city and year English

In [170]:
fig, ax = plt.subplots(2, 1, figsize=(10, 7), sharey=True, sharex=True)

ax1 = western_host_gini_by_city_by_year_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by host in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)


ax2 = non_western_host_gini_by_city_by_year_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.legend(loc='best')
ax2.yaxis.grid(True)

plt.tight_layout()
plt.show();

Plot Gini index by city and year non English

In [171]:
fig, ax = plt.subplots(2, 1, figsize=(10, 7), sharey=True, sharex=True)

ax1 = western_host_gini_by_city_by_year_non_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by host in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.legend(loc='best')
ax1.yaxis.grid(True)

ax2 = non_western_host_gini_by_city_by_year_non_english.pivot(index='year', columns='city', values='gini').plot(kind='line', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.legend(loc='best')
ax2.yaxis.grid(True)

plt.tight_layout();

Plot Gini index by city and year English (bar chart)

In [173]:
fig, ax = plt.subplots(2, 1, figsize=(10, 10), sharey=True)
ax1 = western_host_gini_by_city_by_year_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by host in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_host_gini_by_city_by_year_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout();

Plot Gini index by city and year non English (bar chart)

In [174]:
fig, ax = plt.subplots(2, 1, figsize=(10, 10), sharey=True)
ax1 = western_host_gini_by_city_by_year_non_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[0])
ax1.set_title('Gini index for Airbnb adoption by host in Western countries by year')
ax1.set_ylabel('Gini index')
ax1.yaxis.grid(True)
ax1.xaxis.set_tick_params(rotation=45)

ax2 = non_western_host_gini_by_city_by_year_non_english.pivot(index='city', columns='year', values='gini').plot(kind='bar', ax=ax[1])
ax2.set_title('Gini index for Airbnb adoption by host in non Western countries by year')
ax2.set_ylabel('Gini index')
ax2.yaxis.grid(True)
ax2.xaxis.set_tick_params(rotation=45)

plt.tight_layout();
In [ ]: